
Types of Parallelism

There are two common ways to implement micro-coded parallel processors: as lock-step processors or as independent processors. Independent processors let each parallel processor execute a different sequence of instructions. In the bank teller example, this might amount to one teller processing a deposit while another teller processes a withdrawal. In graphics (unlike a bank), the preceding tasks are often very similar to the current and immediately following tasks. For example, a triangle to be rendered is likely to be followed by several more triangles to be rendered the same way, but with different coordinates. Parallel processors can take advantage of the homogeneous nature of graphics calculations by running in lock-step. Each graphics processor working on one of a set of triangles of exactly the same type can execute the identical transformation, lighting, and clip testing micro-code, but execute that code on the coordinates of its own triangle.
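As a rough illustration of lock-step execution, the following Python sketch issues each instruction of one shared program once and applies it to every processor's data before moving on. The two "micro-code" instructions (`scale_by_two`, `shift_x_by_one`) and the function `run_lock_step` are hypothetical stand-ins invented for this example; they are not real transformation or lighting micro-code.

```python
from typing import Callable, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Instruction = Callable[[Vec3], Vec3]


def scale_by_two(v: Vec3) -> Vec3:
    # Hypothetical instruction: uniform scale of the coordinates.
    return (v[0] * 2.0, v[1] * 2.0, v[2] * 2.0)


def shift_x_by_one(v: Vec3) -> Vec3:
    # Hypothetical instruction: translate along the x axis.
    return (v[0] + 1.0, v[1], v[2])


# One shared instruction sequence, standing in for a single micro-code store.
MICROCODE: List[Instruction] = [scale_by_two, shift_x_by_one]


def run_lock_step(program: Sequence[Instruction],
                  lanes: Sequence[Vec3]) -> List[Vec3]:
    """Apply each instruction to every lane before issuing the next one."""
    state = list(lanes)
    for instruction in program:
        # Every "processor" executes the same instruction on its own data.
        state = [instruction(v) for v in state]
    return state


if __name__ == "__main__":
    vertices = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (-1.0, 0.5, 4.0)]
    print(run_lock_step(MICROCODE, vertices))
```

Independent processors would instead need one program per lane, which is what allows each of them to follow a different instruction sequence.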

The advantage of lock-step execution is that only a single micro-code memory store is necessary, and programming lock-step processors can be less complex than programming independent processors. A single micro-code memory store is generally cheaper than an independent micro-code store per processor. Lock-step execution works well when the workload is homogeneous. If eight lock-step processors can all be working on different triangles of the same type, the triangles can theoretically be transformed eight times faster than with a single processor. But if an OpenGL mode change means the triangles are no longer all of the same type, the advantage of lock-step parallelism breaks down.
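The effect of mode changes on lock-step throughput can be sketched with a toy model. It assumes that triangles must be processed in order, that only consecutive triangles of the same type can share a lock-step pass, and that one pass takes as long as a single processor needs for one triangle. These assumptions, the type labels `"flat"` and `"lit"`, and the function names are illustrative only and do not describe any particular hardware.

```python
from itertools import groupby
from math import ceil
from typing import Sequence


def lock_step_passes(triangle_types: Sequence[str], processors: int) -> int:
    """Count lock-step passes needed for an ordered stream of triangles.

    Consecutive triangles of the same type form a run; each run is
    processed `processors` triangles at a time.
    """
    passes = 0
    for _type, run in groupby(triangle_types):
        passes += ceil(len(list(run)) / processors)
    return passes


def speedup(triangle_types: Sequence[str], processors: int) -> float:
    """Speedup over one processor that handles one triangle per pass."""
    passes = lock_step_passes(triangle_types, processors)
    if passes == 0:
        return 1.0  # Empty stream: nothing to speed up.
    return len(triangle_types) / passes


if __name__ == "__main__":
    homogeneous = ["flat"] * 16        # no mode changes
    alternating = ["flat", "lit"] * 8  # a mode change before every triangle
    print(speedup(homogeneous, 8))
    print(speedup(alternating, 8))
```

Under this model the homogeneous stream reaches the full eight-fold speedup, while the stream with a mode change before every triangle gains nothing over a single processor.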

Lock-step processors are called SIMD processors, meaning they have a Single Instruction stream for Multiple Data streams. Likewise, independent processors are referred to as MIMD processors, meaning they have Multiple Instruction streams for Multiple Data streams. The terms MIMD and SIMD are often used to describe the architectures of geometry transformation hardware. If other performance factors are equal (they never are!), a SIMD architecture is likely to be less expensive than a MIMD architecture, though the MIMD architecture is potentially better at less homogeneous workloads.

In a pipelined or parallel system, an N-stage pipeline or N-way parallelism does not necessarily imply a speedup of N times. When the state within a hardware pipeline changes, stalls are likely to have to be introduced into the pipeline. These stalls undermine performance because not all of the stages are kept busy with work. Likewise, in a parallel system, there may not be enough work to keep all N processors busy, or communication overhead may be needed to coordinate the parallel processors.
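A back-of-the-envelope model makes the shortfall concrete. The sketch below assumes every pipeline stage takes one cycle, that an unpipelined unit needs `stages` cycles per item, and that stalls and coordination overhead can be lumped into a single cycle count. The function names and parameters are assumptions made for this example, not measurements of any real system.

```python
from math import ceil


def pipeline_speedup(stages: int, items: int, stall_cycles: int = 0) -> float:
    """Speedup of an N-stage pipeline over an unpipelined unit.

    Unpipelined: stages * items cycles.
    Pipelined: stages cycles to fill, one more cycle per further item,
    plus any stall cycles.
    """
    if stages < 1 or items < 1:
        raise ValueError("stages and items must be at least 1")
    unpipelined = stages * items
    pipelined = stages + (items - 1) + stall_cycles
    return unpipelined / pipelined


def parallel_speedup(processors: int, items: int,
                     overhead_cycles: int = 0) -> float:
    """Speedup of N processors over one, at one cycle per item."""
    if processors < 1 or items < 1:
        raise ValueError("processors and items must be at least 1")
    parallel = ceil(items / processors) + overhead_cycles
    return items / parallel


if __name__ == "__main__":
    print(pipeline_speedup(4, 1000))                     # close to, but below, 4
    print(pipeline_speedup(4, 1000, stall_cycles=1000))  # roughly halved
    print(parallel_speedup(8, 4))                        # too little work for 8
    print(parallel_speedup(8, 800, overhead_cycles=100))
```

In this model the pipeline only approaches a speedup of N for long, stall-free streams, and the parallel system only approaches N when there is enough work and little coordination overhead.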






mjk@sgi.com